home *** CD-ROM | disk | FTP | other *** search
Text File | 1999-09-24 | 17.8 KB | 346 lines | [TEXT/MPS ] |
- #=======================================================================
- # FTP file name: CORPCHAR.TXT
- #
- # Contents: Registry (external version) of Apple use of
- # Unicode corporate-zone characters.
- #
- # Copyright: (c) 1994-1999 by Apple Computer, Inc., all rights
- # reserved.
- #
- # Contact: charsets@apple.com
- #
- # Changes:
- #
- # b03 1999-Sep-22 Update contact e-mail address. Matches
- # internal registry <b3> and Text Encoding
- # Converter version 1.5.
- # b02 1998-Aug-18 Expanded usage of 0xF8A0. Matches internal
- # registry <b3>.
- # n11 1998-Feb-05 Minor update to header comments
- # n09 1997-Dec-14 Update to match internal registry <n23>:
- # Add source hint 0xF850, transcoding hints
- # 0xF860-0xF86B and 0xF870-0xF872, deprecate
- # almost all other non-hint corporate
- # characters.
- # n08 1997-Jul-17 Update to match internal registry <n13>:
- # Add characters for Mac OS Chinese, Korean &
- # Farsi. Add CJK source hints. Deprecate some
- # characters in favor of combinations of
- # standard characters and transcoding hints.
- # Change header format.
- # n04 1995-Nov-15 Update to match internal registry <n8>:
- # Add characters for Mac OS Hebrew and Thai.
- # n02 1995-Apr-18 First version. Matches internal registry
- # <n5>.
- #
- # Standard header:
- # ----------------
- #
- # Apple, the Apple logo, and Macintosh are trademarks of Apple
- # Computer, Inc., registered in the United States and other countries.
- # Unicode is a trademark of Unicode Inc. For the sake of brevity,
- # throughout this document, "Macintosh" can be used to refer to
- # Macintosh computers and "Unicode" can be used to refer to the
- # Unicode standard.
- #
- # Apple makes no warranty or representation, either express or
- # implied, with respect to these tables, their quality, accuracy, or
- # fitness for a particular purpose. In no event will Apple be liable
- # for direct, indirect, special, incidental, or consequential damages
- # resulting from any defect or inaccuracy in this document or the
- # accompanying tables.
- #
- # These mapping tables and character lists are subject to change.
- # The latest tables should be available from the following:
- #
- # <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
- # <ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
- #
- # For general information about Mac OS encodings and these mapping
- # tables, see the file "README.TXT".
- #
- # Format:
- # -------
- #
- # Two tab-separated columns;
- # '#' begins a comment which continues to the end of the line.
- # Column #1 is the Unicode corporate character code point
- # (in hex as 0xNNNN)
- # Column #2 is a comment containing:
- # 1) an informal name describing the Unicode corporate character,
- # or if it is deprecated, information about what to use
- # instead.
- # 2) optionally, another '#', followed by information on which
- # Mac OS encodings use the Unicode corporate character, and -
- # if relevant - the Mac OS code points that correspond to the
- # corporate character.
- #
- # The entries are in Unicode order.
- #_______________________________________________________________________
-
- # The block of 16 characters 0xF850-0xF85F is for source hint characters.
- # These have no display (like zero-width no-break space). If they appear
- # in text, they can only be mapped to tables that include them. If a run
- # of Unicode characters such as Han characters could otherwise be mapped
- # to any of several encodings, including one of these hint characters can
- # force the text to be mapped only to an encoding whose mapping table
- # includes the hint character. Once they have forced mapping to a particular
- # encoding, they no longer apply (they don't need to be cancelled); if a
- # subsequent character cannot be mapped to that encoding, it may be mapped
- # to another encoding. Currently source hints are mainly defined for CJK
- # source disambiguation.
- # NOTE: These are only defined for application developers who have requested
- # them. The Mac OS Text Encoding Converter does not generate these when
- # converting from other CJK encodings to Unicode. However, it will handle
- # these characters correctly when converting from Unicode to other encodings.
- 0xF850 # source hint: Reset, try all candidate encodings in preferred order.
- 0xF85C # source hint: Chinese simplified
- 0xF85D # source hint: Chinese traditional
- 0xF85E # source hint: Japanese
- 0xF85F # source hint: Korean
-
- # The block of 32 characters 0xF860-0xF87F is for transcoding hints.
- # These are used in combination with standard Unicode characters to force
- # them to be treated in a special way for mapping to other encodings;
- # they have no other effect.
- #
- # 0xF870-0xF87F are "variant tags" - they are like combining characters,
- # and can follow a standard Unicode (or a sequence consisting of a base
- # character and other combining characters) to tag it so that it will be
- # unique, treated in a special way for transcoding. These always terminate
- # a sequence of combining characters.
- #
- # 0xF860-0xF86B are "grouping hints" - they precede a group of two to
- # four standard Unicode characters to indicate that they are treated as a
- # group for transcoding. This grouping overrides any other combining
- # behavior.
- #
- # Here are the ones defined so far:
- 0xF860 # transcoding hint: group next 2 characters # Japanese,Korean
- 0xF861 # transcoding hint: group next 3 characters # Japanese,Korean
- 0xF862 # transcoding hint: group next 4 characters # Japanese,Korean
- 0xF863 # transcoding hint: group next 4 characters, alt1 # Korean
- 0xF864 # transcoding hint: group next 4 characters, alt2 # Korean
- 0xF865 # transcoding hint: group next 4 characters, alt3 # Korean
- 0xF866 # transcoding hint: group next 4 characters, alt4 # Korean
- 0xF867 # transcoding hint: group next 2 characters, alt1 # Korean
- 0xF868 # transcoding hint: group next 2 characters, alt2 # Korean
- 0xF869 # transcoding hint: group next 2 characters, alt3 # Korean
- 0xF86A # transcoding hint: group next 2 characters, RL # Hebrew
- 0xF86B # transcoding hint: group next 4 characters, RL # Farsi variant
- #
- 0xF870 # transcoding hint: variant tag 16 # Symbol, Korean
- 0xF871 # transcoding hint: variant tag 15 # Symbol, Korean
- 0xF872 # transcoding hint: variant tag 14 # Symbol
- 0xF873 # transcoding hint: variant tag 13 # Korean, Thai
- 0xF874 # transcoding hint: variant tag 12 # Korean, Thai
- 0xF875 # transcoding hint: variant tag 11 # Korean, Thai
- 0xF876 # transcoding hint: variant tag 10 # Korean
- 0xF877 # transcoding hint: variant tag 9 # Korean
- 0xF878 # transcoding hint: variant tag 8 # Korean
- 0xF879 # transcoding hint: variant tag 7 # Korean
- 0xF87A # transcoding hint: variant tag 6 # Korean
- 0xF87B # transcoding hint: variant tag 5 # Korean
- 0xF87C # transcoding hint: variant tag 4 # ChineseTrad, Korean, Dingbats
- 0xF87D # transcoding hint: variant tag 3 # ChineseTrad
- 0xF87E # transcoding hint: variant tag 2 # Chinese,Japanese
- 0xF87F # transcoding hint: variant tag 1 # CJK,Symbol,Dingbats,Hebrew
-
- # The following (2) are metrics "characters" so applications can get the
- # height and width of double-byte character glyphs by measuring the glyph of a
- # one-byte character (e.g. calling CharWidth for character 0x82 in a Chinese
- # Traditional font); this approach assumes that the glyphs for all double-byte
- # characters in a font have the same metrics, which is currently true. Note
- # that the width-metric character glyphs are used differently for TrueType and
- # old-style bitmap fonts; for TrueType fonts the metric glyph width is equal
- # to the full width of a double-byte character glyph, while for FBIT/FDEF
- # bitmap fonts the metric glyph width is half the width of a double-byte
- # character glyph.
- 0xF880 # height-metric character for double-byte fonts # Chinese Simp&Trad-0x81
- 0xF881 # width-metric character for double-byte fonts # Chinese Simp&Trad-0x82
-
- # The following (2) are for the TrueType variant of Mac OS Farsi.
- # NOTE: 0xF883 is deprecated in favor of a combination of standard
- # characters and transcoding hint. The deprecated character will still
- # be loosely mapped to the appropriate Mac OS Farsi character.
- 0xF882 # Arabic ligature "peace on him" # Farsi(TrueType variant)-0x8B
- 0xF883 # deprecated, use 0xF86B+0x0631+0x064A+0x0627+0x0644 # Farsi(TrueType variant)-0xA4
-
- # The following (22) are for the Mac OS Thai encoding.
- # In this encoding, positional variants of upper vowels, tone marks,
- # and other marks are normally handled automatically by WorldScript I.
- # However, the Thai-DTP keyboard allows the codes for the positional
- # variants to be entered directly, so they must be treated as
- # characters. When the abstract character is treated as a positional
- # variant, it has the right (and high, if relevant) position.
- # NOTE: These are now all deprecated in favor of combinations of standard
- # characters and transcoding hints. The deprecated characters will still
- # be loosely mapped to the appropriate Mac OS Thai character.
- 0xF884 # deprecated, use 0x0E31+0xF874 # Thai-0x92
- 0xF885 # deprecated, use 0x0E34+0xF874 # Thai-0x94
- 0xF886 # deprecated, use 0x0E35+0xF874 # Thai-0x95
- 0xF887 # deprecated, use 0x0E36+0xF874 # Thai-0x96
- 0xF888 # deprecated, use 0x0E37+0xF874 # Thai-0x97
- 0xF889 # deprecated, use 0x0E47+0xF874 # Thai-0x93
- 0xF88A # deprecated, use 0x0E48+0xF874 # Thai-0x98
- 0xF88B # deprecated, use 0x0E48+0xF873 # Thai-0x88
- 0xF88C # deprecated, use 0x0E48+0xF875 # Thai-0x83
- 0xF88D # deprecated, use 0x0E49+0xF874 # Thai-0x99
- 0xF88E # deprecated, use 0x0E49+0xF873 # Thai-0x89
- 0xF88F # deprecated, use 0x0E49+0xF875 # Thai-0x84
- 0xF890 # deprecated, use 0x0E4A+0xF874 # Thai-0x9A
- 0xF891 # deprecated, use 0x0E4A+0xF873 # Thai-0x8A
- 0xF892 # deprecated, use 0x0E4A+0xF875 # Thai-0x85
- 0xF893 # deprecated, use 0x0E4B+0xF874 # Thai-0x9B
- 0xF894 # deprecated, use 0x0E4B+0xF873 # Thai-0x8B
- 0xF895 # deprecated, use 0x0E4B+0xF875 # Thai-0x86
- 0xF896 # deprecated, use 0x0E4C+0xF874 # Thai-0x9C
- 0xF897 # deprecated, use 0x0E4C+0xF873 # Thai-0x8C
- 0xF898 # deprecated, use 0x0E4C+0xF875 # Thai-0x87
- 0xF899 # deprecated, use 0x0E4D+0xF874 # Thai-0x8F
-
- # The following (6) are for the Mac OS Hebrew encoding. Four of
- # these are for the obsolete "canoral" codes that were used before
- # System 7.1/Worldscript to control positioning of nikud marks (points).
- # In the future these 4 code points may be redefined.
- # NOTE: Some of these are deprecated in favor of a combination of standard
- # character and transcoding hint. The deprecated characters will still
- # be loosely mapped to the appropriate Mac OS Hebrew character.
- 0xF89A # deprecated, use 0xF86A+0x05DC+0x05B9 # Hebrew-0xC0
- 0xF89B # Hebrew canoral 1 # Hebrew-0xC2
- 0xF89C # Hebrew canoral 2 # Hebrew-0xC3
- 0xF89D # Hebrew canoral 3 # Hebrew-0xC4
- 0xF89E # Hebrew canoral 4 # Hebrew-0xC5
- 0xF89F # deprecated, use 0x05B8+0xF87F # Hebrew-0xDE
-
- # The following (1) is for mapping the single undefined code point in
- # the Mac OS Greek and Turkish encodings, thus permitting full
- # round-trip fidelity. This character is also used for mapping EURO SIGN
- # when mapping to Unicode 1.1 (e.g. for Mac OS Roman and Symbol).
- 0xF8A0 # undefined1 # Greek-0xFF, Turkish-0xF5
- # also EURO SIGN for Unicode 1.1 # Roman-0xDB, Symbol-0xA0
-
- # The following (54) are for the Mac OS Japanese encoding.
- # part 1 - Apple corporate Unicode chars for Mac OS Japanese extended
- # characters not in Unicode.
- # NOTE: These are now all deprecated in favor of combinations of standard
- # characters and transcoding hints. The deprecated characters will still
- # be loosely mapped to the appropriate Mac OS Japanese character.
- 0xF8A1 # deprecated, use 0xF860+0x0030+0x002E # Jpn-0x8591
- 0xF8A2 # deprecated, use 0xF862+0x0058+0x0049+0x0049+0x0049 # Jpn-0x85AB
- 0xF8A3 # deprecated, use 0xF861+0x0058+0x0049+0x0056 # Jpn-0x85AC
- 0xF8A4 # deprecated, use 0xF860+0x0058+0x0056 # Jpn-0x85AD
- 0xF8A5 # deprecated, use 0xF862+0x0078+0x0069+0x0069+0x0069 # Jpn-0x85BF
- 0xF8A6 # deprecated, use 0xF861+0x0078+0x0069+0x0076 # Jpn-0x85C0
- 0xF8A7 # deprecated, use 0xF860+0x0078+0x0076 # Jpn-0x85C1
- 0xF8A8 # deprecated, use 0xFF4D+0xF87F # Jpn-0x8645
- 0xF8A9 # deprecated, use 0xFF47+0xF87F # Jpn-0x864B
- 0xF8AA # deprecated, use 0xFF4C+0xF87F # Jpn-0x8650
- 0xF8AB # deprecated, use 0xF860+0x0054+0x0042 # Jpn-0x865D
- 0xF8AC # deprecated, use 0xF861+0x0046+0x0041+0x0058 # Jpn-0x869E
- 0xF8AD # deprecated, use 0xF860+0x2193+0x2191 # Jpn-0x86CE
- 0xF8AE # deprecated, use 0x21E8+0xF87A # Jpn-0x86D3
- 0xF8AF # deprecated, use 0x21E6+0xF87A # Jpn-0x86D4
- 0xF8B0 # deprecated, use 0x21E7+0xF87A # Jpn-0x86D5
- 0xF8B1 # deprecated, use 0x21E9+0xF87A # Jpn-0x86D6
- 0xF8B2 # deprecated, use 0xF862+0x6709+0x9650+0x4F1A+0x793E # Jpn-0x87FB
- 0xF8B3 # deprecated, use 0xF862+0x8CA1+0x56E3+0x6CD5+0x4EBA # Jpn-0x87FC
- 0xF8B4 # deprecated, use 0x301E # Jpn-0x8855
- # part 2 - Apple corporate Unicode chars for Mac OS Japanese vertical
- # forms not in Unicode.
- # NOTE: These are now all deprecated in favor of combinations of standard
- # characters and transcoding hints. The deprecated characters will still
- # be loosely mapped to the appropriate Mac OS Japanese character.
- 0xF8B5 # deprecated, use 0x3001+0xF87E # Jpn-0xEB41
- 0xF8B6 # deprecated, use 0x3002+0xF87E # Jpn-0xEB42
- 0xF8B7 # deprecated, use 0x203E+0xF87E # Jpn-0xEB50
- 0xF8B8 # deprecated, use 0x30FC+0xF87E # Jpn-0xEB5B
- 0xF8B9 # deprecated, use 0x2010+0xF87E # Jpn-0xEB5D
- 0xF8BA # deprecated, use 0x301C+0xF87E # Jpn-0xEB60
- 0xF8BB # deprecated, use 0x2016+0xF87E # Jpn-0xEB61
- 0xF8BC # deprecated, use 0xFF5C+0xF87E # Jpn-0xEB62
- 0xF8BD # deprecated, use 0x22EF+0xF87E # Jpn-0xEB63
- 0xF8BE # deprecated, use 0xFF3B+0xF87E # Jpn-0xEB6D
- 0xF8BF # deprecated, use 0xFF3D+0xF87E # Jpn-0xEB6E
- 0xF8C0 # deprecated, use 0xFF1D+0xF87E # Jpn-0xEB81
- 0xF8C1 # deprecated, use 0x3041+0xF87E # Jpn-0xEC9F
- 0xF8C2 # deprecated, use 0x3043+0xF87E # Jpn-0xECA1
- 0xF8C3 # deprecated, use 0x3045+0xF87E # Jpn-0xECA3
- 0xF8C4 # deprecated, use 0x3047+0xF87E # Jpn-0xECA5
- 0xF8C5 # deprecated, use 0x3049+0xF87E # Jpn-0xECA7
- 0xF8C6 # deprecated, use 0x3063+0xF87E # Jpn-0xECC1
- 0xF8C7 # deprecated, use 0x3083+0xF87E # Jpn-0xECE1
- 0xF8C8 # deprecated, use 0x3085+0xF87E # Jpn-0xECE3
- 0xF8C9 # deprecated, use 0x3087+0xF87E # Jpn-0xECE5
- 0xF8CA # deprecated, use 0x308E+0xF87E # Jpn-0xECEC
- 0xF8CB # deprecated, use 0x30A1+0xF87E # Jpn-0xED40
- 0xF8CC # deprecated, use 0x30A3+0xF87E # Jpn-0xED42
- 0xF8CD # deprecated, use 0x30A5+0xF87E # Jpn-0xED44
- 0xF8CE # deprecated, use 0x30A7+0xF87E # Jpn-0xED46
- 0xF8CF # deprecated, use 0x30A9+0xF87E # Jpn-0xED48
- 0xF8D0 # deprecated, use 0x30C3+0xF87E # Jpn-0xED62
- 0xF8D1 # deprecated, use 0x30E3+0xF87E # Jpn-0xED83
- 0xF8D2 # deprecated, use 0x30E5+0xF87E # Jpn-0xED85
- 0xF8D3 # deprecated, use 0x30E7+0xF87E # Jpn-0xED87
- 0xF8D4 # deprecated, use 0x30EE+0xF87E # Jpn-0xED8E
- 0xF8D5 # deprecated, use 0x30F5+0xF87E # Jpn-0xED95
- 0xF8D6 # deprecated, use 0x30F6+0xF87E # Jpn-0xED96
-
- # The following (14) are for the Mac OS Dingbats encoding.
- # NOTE: These are now all deprecated in favor of standard characters or
- # combinations of standard characters and transcoding hints. The
- # deprecated characters will still be loosely mapped to the appropriate
- # Mac OS Dingbats character.
- 0xF8D7 # deprecated, use 0x0028 # Dingbats-0x80
- 0xF8D8 # deprecated, use 0x0029 # Dingbats-0x81
- 0xF8D9 # deprecated, use 0x0028+0xF87F # Dingbats-0x82
- 0xF8DA # deprecated, use 0x0029+0xF87F # Dingbats-0x83
- 0xF8DB # deprecated, use 0x3008 # Dingbats-0x84
- 0xF8DC # deprecated, use 0x3009 # Dingbats-0x85
- 0xF8DD # deprecated, use 0x2039 # Dingbats-0x86
- 0xF8DE # deprecated, use 0x203A # Dingbats-0x87
- 0xF8DF # deprecated, use 0x3008+0xF87C # Dingbats-0x88
- 0xF8E0 # deprecated, use 0x3009+0xF87C # Dingbats-0x89
- 0xF8E1 # deprecated, use 0x3014 # Dingbats-0x8A
- 0xF8E2 # deprecated, use 0x3015 # Dingbats-0x8B
- 0xF8E3 # deprecated, use 0x007B # Dingbats-0x8C
- 0xF8E4 # deprecated, use 0x007D # Dingbats-0x8D
-
- # The following (26) are for the Mac OS Symbol encoding.
- # NOTE: Some of these are deprecated in favor of combinations of standard
- # characters and transcoding hints. The deprecated characters will still
- # be loosely mapped to the appropriate Mac OS Symbol character.
- 0xF8E5 # radical extender # Symbol-0x60
- 0xF8E6 # vertical arrow extender # Symbol-0xBD
- 0xF8E7 # horizontal arrow extender # Symbol-0xBE
- 0xF8E8 # deprecated, use 0x00AE+0xF87F # Symbol-0xE2
- 0xF8E9 # deprecated, use 0x00A9+0xF87F # Symbol-0xE3
- 0xF8EA # deprecated, use 0x2122+0xF87F # Symbol-0xE4
- 0xF8EB # deprecated, use 0x0028+0xF870 # Symbol-0xE6
- 0xF8EC # deprecated, use 0x0028+0xF871 # Symbol-0xE7
- 0xF8ED # deprecated, use 0x0028+0xF872 # Symbol-0xE8
- 0xF8EE # deprecated, use 0x005B+0xF870 # Symbol-0xE9
- 0xF8EF # deprecated, use 0x005B+0xF871 # Symbol-0xEA
- 0xF8F0 # deprecated, use 0x005B+0xF872 # Symbol-0xEB
- 0xF8F1 # deprecated, use 0x007B+0xF870 # Symbol-0xEC
- 0xF8F2 # deprecated, use 0x007B+0xF871 # Symbol-0xED
- 0xF8F3 # deprecated, use 0x007B+0xF872 # Symbol-0xEE
- 0xF8F4 # curly bracket extender # Symbol-0xEF
- 0xF8F5 # deprecated, use 0x222B+0xF871 # Symbol-0xF4
- 0xF8F6 # deprecated, use 0x0029+0xF870 # Symbol-0xF6
- 0xF8F7 # deprecated, use 0x0029+0xF871 # Symbol-0xF7
- 0xF8F8 # deprecated, use 0x0029+0xF872 # Symbol-0xF8
- 0xF8F9 # deprecated, use 0x005D+0xF870 # Symbol-0xF9
- 0xF8FA # deprecated, use 0x005D+0xF871 # Symbol-0xFA
- 0xF8FB # deprecated, use 0x005D+0xF872 # Symbol-0xFB
- 0xF8FC # deprecated, use 0x007D+0xF870 # Symbol-0xFC
- 0xF8FD # deprecated, use 0x007D+0xF871 # Symbol-0xFD
- 0xF8FE # deprecated, use 0x007D+0xF872 # Symbol-0xFE
-
- # The following (1) is for the Mac OS Roman encoding
- # (also used in Symbol & Croatian).
- # NOTE: The graphic image associated with the Apple logo character is
- # not authorized for use without permission of Apple, and unauthorized
- # use might constitute trademark infringement.
- 0xF8FF # Apple logo # Roman-0xF0, Symbol-0xF0, Croatian-0xD8
-